NSF PAR Search | NSF Public Access Repository

DIFFUSE: predicting isoform functions from sequences and expression profiles via deep learning

https://doi.org/10.1093/bioinformatics/btz367

Chen, Hao; Shaw, Dipan; Zeng, Jianyang; Bu, Dongbo; Jiang, Tao (July 2019, Bioinformatics)

Abstract. Motivation: Alternative splicing generates multiple isoforms from a single gene, greatly increasing the functional diversity of a genome. Although gene functions have been well studied, little is known about the specific functions of isoforms, making accurate prediction of isoform functions highly desirable. However, the existing approaches to predicting isoform functions are far from satisfactory for at least two reasons: (i) unlike gene-level annotations, isoform-level functional annotations are scarce; and (ii) information about isoform functions is concealed in various types of data, including isoform sequences and co-expression relationships among isoforms. Results: In this study, we present a novel approach, DIFFUSE (Deep learning-based prediction of IsoForm FUnctions from Sequences and Expression), to predict isoform functions. To integrate various types of data, our approach adopts a hybrid framework that first uses a deep neural network (DNN) to predict the functions of isoforms from their genomic sequences and then refines the prediction using a conditional random field (CRF) based on co-expression relationships. To overcome the lack of isoform-level ground truth labels, we further propose an iterative semi-supervised learning algorithm to train the DNN and CRF together. Our extensive computational experiments demonstrate that DIFFUSE can effectively predict the functions of isoforms and genes. It achieves an average area under the receiver operating characteristic curve of 0.840 and an area under the precision–recall curve of 0.581 over 4184 GO functional categories, both significantly higher than those of the state-of-the-art methods. We further validate the prediction results by analyzing the correlation between functional similarity, sequence similarity, expression similarity and structural similarity, as well as the consistency between the predicted functions and some well-studied functional features of isoform sequences.
Availability and implementation: https://github.com/haochenucr/DIFFUSE. Supplementary information: Supplementary data are available at Bioinformatics online.
NeoDTI: neural integration of neighbor information from a heterogeneous network for discovering new drug–target interactions

https://doi.org/10.1093/bioinformatics/bty543

Wan, Fangping; Hong, Lixiang; Xiao, An; Jiang, Tao; Zeng, Jianyang; Wren, Jonathan (ed.) (July 2018, Bioinformatics)

Abstract. Motivation: Accurately predicting drug–target interactions (DTIs) in silico can guide the drug discovery process and thus facilitate drug development. Computational approaches for DTI prediction that adopt the systems biology perspective generally exploit the rationale that the properties of drugs and targets can be characterized by their functional roles in biological networks. Results: Inspired by recent advances in information passing and aggregation techniques that generalize convolutional neural networks to mine large-scale graph data and greatly improve the performance of many network-related prediction tasks, we develop a new nonlinear end-to-end learning model, called NeoDTI, that integrates diverse information from heterogeneous network data and automatically learns topology-preserving representations of drugs and targets to facilitate DTI prediction. The substantial improvement in prediction performance over other state-of-the-art DTI prediction methods, as well as several novel predicted DTIs with supporting evidence from previous studies, demonstrates the superior predictive power of NeoDTI. In addition, NeoDTI is robust against a wide range of hyperparameter choices and is ready to integrate more drug- and target-related information (e.g. compound–protein binding affinity data). All these results suggest that NeoDTI can offer a powerful and robust tool for drug development and drug repositioning. Availability and implementation: The source code and data used in NeoDTI are available at: https://github.com/FangpingWan/NeoDTI. Supplementary information: Supplementary data are available at Bioinformatics online.
FreePSI: an alignment-free approach to estimating exon-inclusion ratios without a reference transcriptome

https://doi.org/10.1093/nar/gkx1059

Zhou, Jianyu; Ma, Shining; Wang, Dongfang; Zeng, Jianyang; Jiang, Tao (November 2017, Nucleic Acids Research)

Full Text Available
TITER: predicting translation initiation sites by deep learning

https://doi.org/10.1093/bioinformatics/btx247

Zhang, Sai; Hu, Hailin; Jiang, Tao; Zhang, Lei; Zeng, Jianyang (July 2017, Bioinformatics)

Full Text Available
DeepHINT: understanding HIV-1 integration via deep learning with attention

https://doi.org/10.1093/bioinformatics/bty842

Hu, Hailin; Xiao, An; Zhang, Sai; Li, Yangyang; Shi, Xuanling; Jiang, Tao; Zhang, Linqi; Zhang, Lei; Zeng, Jianyang; Berger, Bonnie (ed.) (October 2018, Bioinformatics)

Abstract. Motivation: Human immunodeficiency virus type 1 (HIV-1) genome integration is closely related to clinical latency and viral rebound. In addition to human DNA sequences that directly interact with the integration machinery, the selection of HIV integration sites has also been shown to depend on the heterogeneous genomic context around a large region, which greatly hinders the prediction and mechanistic studies of HIV integration. Results: We have developed an attention-based deep learning framework, named DeepHINT, to simultaneously provide accurate prediction of HIV integration sites and mechanistic explanations of the detected sites. Extensive tests on a high-density HIV integration site dataset showed that DeepHINT can outperform conventional modeling strategies by automatically learning the genomic context of HIV integration from primary DNA sequence alone or together with epigenetic information. Systematic analyses of diverse known factors of HIV integration further validated the biological relevance of the prediction results. More importantly, in-depth analyses of the attention values output by DeepHINT revealed intriguing mechanistic implications for the selection of HIV integration sites, including potential roles of several DNA-binding proteins. These results establish DeepHINT as an effective and explainable deep learning framework for the prediction and mechanistic study of HIV integration. Availability and implementation: DeepHINT is available as open-source software and can be downloaded from https://github.com/nonnerdling/DeepHINT. Supplementary information: Supplementary data are available at Bioinformatics online.
Analysis of Ribosome Stalling and Translation Elongation Dynamics by Deep Learning

https://doi.org/10.1016/j.cels.2017.08.004

Zhang, Sai; Hu, Hailin; Zhou, Jingtian; He, Xuan; Jiang, Tao; Zeng, Jianyang (September 2017, Cell Systems)

Full Text Available
Reconstructing spatial organizations of chromosomes through manifold learning

https://doi.org/10.1093/nar/gky065

Zhu, Guangxiang; Deng, Wenxuan; Hu, Hailin; Ma, Rui; Zhang, Sai; Yang, Jinglin; Peng, Jian; Kaplan, Tommy; Zeng, Jianyang (February 2018, Nucleic Acids Research)

Full Text Available
A network integration approach for drug-target interaction prediction and computational drug repositioning from heterogeneous information

https://doi.org/10.1038/s41467-017-00680-8

Luo, Yunan; Zhao, Xinbin; Zhou, Jingtian; Yang, Jinglin; Zhang, Yanqing; Kuang, Wenhua; Peng, Jian; Chen, Ligong; Zeng, Jianyang (December 2017, Nature Communications)

Full Text Available
A deep boosting based approach for capturing the sequence binding preferences of RNA-binding proteins from high-throughput CLIP-seq data

https://doi.org/10.1093/nar/gkx492

Li, Shuya; Dong, Fanghong; Wu, Yuexin; Zhang, Sai; Zhang, Chen; Liu, Xiao; Jiang, Tao; Zeng, Jianyang (May 2017, Nucleic Acids Research)

Full Text Available
